A novel framework of text-independent speaker verification based on utterance transform and iterative cohort modeling

نویسندگان

  • Ming Liu
  • Huazhong Ning
  • Thomas S. Huang
  • Zhengyou Zhang
چکیده

A novel framework for text-independent speaker verification is proposed. The framework is based on a new interpretation of Universal Background Model. The UBM in our framework actually defines a transform which maps the variable length observation into a fixed dimensional supervector(supervector space). Each speech utterance is then mapped into a point in this supervector space. The similarity measure in this vector space is progressively refined via an iterative cohort modeling scheme. The experiments on NIST 2002 corpus show the effectiveness of this new framework. Overall the EER drops from the baseline system(with TNorm) 9.21% to final improved system(without T-Norm) 8.07%. The new framework can effectively reduce the data dependence in the final output score which is clearly indicated in the second sets of experiments. The EER after T-Norm of final system marginally increases by relatively 1.73% compared to the EER of baseline system drops 16.12% relatively after T-Norm. Also, the relative improvement of DCF after T-Norm is marginal for the final improved system (2.47%) compared to 33.68% in baseline system. It clear shows that the iterative cohort modeling effectively reduce the data dependence of the final scores, so that T-Norm will not further improve the system performance. Also, the performance of novel frame clearly increases as the iteration grows which suggest that the framework progressively refine the similarity measure on the supervector space with the iterative cohort modeling.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Novel Framework of Text-independen Utterance Transform and Itera

A novel framework for text-independent speaker verification is proposed. The framework is based on a new interpretation of Universal Background Model. The UBM in our framework actually defines a transform which maps the variable length observation into a fixed dimensional supervector(supervector space). Each speech utterance is then mapped into a point in this supervector space. The similarity ...

متن کامل

Text-Independent Speaker Verification via State Alignment

To model the speech utterance at a finer granularity, this paper presents a novel state-alignment based supervector modeling method for text-independent speaker verification, which takes advantage of state-alignment method used in hidden Markov model (HMM) based acoustic modeling in speech recognition. By this way, the proposed modeling method can convert a text-independent speaker verification...

متن کامل

Successive cohort selection (SCS) for text-independent speaker verification

A novel cohort selection method, namely, successive cohort selection (SCS) is presented in this paper for text-independent speaker verification. The proposed method computes distance between two models directly and it selects new cohort member based on both the claimed speaker model and the existing cohort members. In addition to this new cohort selection method, we also propose a new score mea...

متن کامل

Model selection and score normalization for text-dependent single utterance speaker verification

In this paper, we investigate model selection and channel variability issues on a text-dependent single utterance (TDSU) speaker verification application. Due to the lack of an appropriate database for the task, a multichannel speaker recognition database, which consists of multiple recordings of a single Turkish utterance, is collected. The first set of experiments is devoted to model selectio...

متن کامل

Unsupervised learning of HMM topology for text-dependent speaker verification

Usually, text-dependent speaker verification can achieve better performance than text-independent system because of the constraint that the enrollment and testing utterance share the same phonetic content. However, the enrollment data for text-dependent system usually is very limited. Expectation Maximization(EM) training of HMM will suffer from noisy estimation because of limited enrollment. A...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006